Learning Sparse Prototypes for Text Generation

Neural Information Processing Systems

Prototype-driven text generation uses non-parametric models that first choose from a library of sentence prototypes and then modify the prototype to generate the output text. While effective, these methods are inefficient at test time as a result of needing to store and index the entire training corpus. Further, existing methods often require heuristics to identify which prototypes to reference at training time. In this paper, we propose a novel generative model that automatically learns a sparse prototype support set that, nonetheless, achieves strong language modeling performance. This is achieved by (1) imposing a sparsity-inducing prior on the prototype selection distribution, and (2) utilizing amortized variational inference to learn a prototype retrieval function. In experiments, our model outperforms previous prototype-driven language models while achieving up to a 1000x memory reduction, as well as a 1000x speed-up at test time. More interestingly, we show that the learned prototypes are able to capture semantics and syntax at different granularity as we vary the sparsity of prototype selection, and that certain sentence attributes can be controlled by specifying the prototype for generation.


Review for NeurIPS paper: Learning Sparse Prototypes for Text Generation


Additional Feedback: I enjoyed reading your manuscript, and I find the idea of generating sentences by editing prototypes an exciting direction. In the following I'd like to raise a few points; some are comments, some are clarification questions. 1. If the support of the retrieval component q(t | x) is an index set of the training data (that is, a sample t indexes a training instance), you cannot easily change the support at test time, can you? If I am right that you cannot, then you are limited to using the same repository of prototypes during training and test. This in turn means that subsampling the training data, for scalability, also affects the set of prototypes available for test-time generations.


Review for NeurIPS paper: Learning Sparse Prototypes for Text Generation


This paper builds upon Guu et al. (2018)'s prototype-driven text generation approach. Two major changes are made: first, modeling a sparse distribution over prototypes with a Dirichlet prior over a multinomial, and second, actually learning this sparse distribution. At training time, the paper uses amortized variational inference, further approximating the gradients using REINFORCE to deal with the large number of prototypes. At inference time, the model can keep fewer training examples in memory by retaining only those whose posterior probability is larger than a threshold. Thus both the memory required to store training examples and the time spent on retrieving them are reduced.
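The pruning idea summarized above can be illustrated with a small sketch. It is not the paper's implementation: it draws prototype-selection weights from a symmetric Dirichlet using NumPy and keeps only the prototypes whose weight exceeds a threshold. The function names, the number of prototypes, the concentration values, and the threshold are all assumptions chosen for illustration.

```python
import numpy as np


def sample_prototype_weights(num_prototypes, alpha, seed=0):
    """Draw multinomial weights over prototypes from a symmetric Dirichlet.

    A concentration alpha < 1 favors sparse weight vectors, where most of
    the mass sits on a small subset of prototypes (illustrative only).
    """
    rng = np.random.default_rng(seed)
    return rng.dirichlet(np.full(num_prototypes, alpha))


def prune_prototypes(weights, threshold):
    """Return indices of prototypes whose weight exceeds the threshold."""
    return np.flatnonzero(weights > threshold)


# Hypothetical settings: 1,000 candidate prototypes, threshold 1e-3.
num_prototypes = 1000
threshold = 1e-3

sparse_weights = sample_prototype_weights(num_prototypes, alpha=0.1)
dense_weights = sample_prototype_weights(num_prototypes, alpha=10.0)

sparse_kept = prune_prototypes(sparse_weights, threshold)
dense_kept = prune_prototypes(dense_weights, threshold)

print("kept with alpha=0.1: ", sparse_kept.size)
print("kept with alpha=10.0:", dense_kept.size)
```

Only the retained indices would need to be stored and searched at test time, which is where the memory and retrieval savings come from in this toy setting. A smaller concentration tends to leave fewer prototypes above the threshold.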

